{
 "cells": [
  {
   "cell_type": "markdown",
   "source": [
    "![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)\n",
    "\n",
    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/ocr/ocr_form_relation_extractor.ipynb)\n",
    "\n",
    "[Tutorial Notebook](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/ocr/ocr_form_relation_extractor.ipynb \"https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/ocr/ocr_form_relation_extractor.ipynb\")\n"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "# **FormRelationExtractor**\n",
    "\n",
    "\n",
    "The **FormRelationExtractor** is a tool designed to identify the relationships between keys and values. It’s particularly useful in the context of data extracted by a Named Entity Recognition (NER) system, such as VisualDocumentNER.\n",
    "\n",
    "**All the available models:**\n",
    "\n",
    "| NLU Spell            | Transformer Class                                                                       |\n",
    "|----------------------|-----------------------------------------------------------------------------------------|\n",
    "| nlu.load(`visual_form_relation_extractor`) | [FormRelationExtractor](https://nlp.johnsnowlabs.com/docs/en/ocr_visual_document_understanding#formrelationextractor) |"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "## **Starting the session**"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "from johnsnowlabs import nlp\n",
    "nlp.install(visual=True)\n",
    "nlp.start(visual=True)"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "## **Form Relation Extraction**"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "🚨 Outdated Medical Secrets in license file. Version=5.3.1 but should be Version=5.1.1\n",
      "🚨 Outdated OCR Secrets in license file. Version=5.3.1 but should be Version=5.0.2\n",
      "📋 Loading license number 0 from C:\\Users\\gadde/.johnsnowlabs\\licenses/license_number_0_for_Spark-Healthcare_Spark-OCR.json\n",
      "👷 Trying to install compatible secrets. Use nlp.settings.enforce_versions=False if you want to install outdated secrets.\n",
      "👷 Trying to install compatible secrets. Use nlp.settings.enforce_versions=False if you want to install outdated secrets.\n",
      "👷 Setting up  John Snow Labs home in C:\\Users\\gadde/.johnsnowlabs, this might take a few minutes.\n",
      "Downloading 🫘+🚀 Java Library spark-nlp-assembly-5.1.1.jar\n",
      "🙆 JSL Home setup in C:\\Users\\gadde/.johnsnowlabs\n",
      "🤓 Looks like you are missing some jars, trying fetching them ...\n",
      "👷 Trying to install compatible secrets. Use nlp.settings.enforce_versions=False if you want to install outdated secrets.\n",
      "Downloading 🫘+💊 Java Library spark-nlp-jsl-5.1.1.jar\n",
      "Downloading 🫘+🕶 Java Library spark-ocr-assembly-5.0.2.jar\n",
      "🙆 JSL Home setup in C:\\Users\\gadde/.johnsnowlabs\n",
      "👷 Trying to install compatible secrets. Use nlp.settings.enforce_versions=False if you want to install outdated secrets.\n",
      "👌 Launched \u001B[92mcpu optimized\u001B[39m session with with: 🚀Spark-NLP==5.3.1, 💊Spark-Healthcare==5.1.1, 🕶Spark-OCR==5.0.2, running on ⚡ PySpark==3.1.2\n",
      "Warning::Spark Session already created, some configs may not take.\n",
      "Warning::Spark Session already created, some configs may not take.\n",
      "lilt_roberta_funsd_v1 download started this may take some time.\n",
      "Approximate size to download 419.6 MB\n"
     ]
    }
   ],
   "source": [
    "model = nlp.load('visual_form_relation_extractor')"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2024-05-13T08:27:37.781697200Z",
     "start_time": "2024-05-13T08:17:43.901075500Z"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Warning::Spark Session already created, some configs may not take.\n"
     ]
    }
   ],
   "source": [
    "! wget https://github.com/JohnSnowLabs/nlu/blob/master/tests/datasets/ocr/images/form.png\n",
    "! wget https://github.com/JohnSnowLabs/nlu/blob/master/tests/datasets/ocr/images/form2.jpg\n",
    "\n",
    "res = model.predict(['form.png','form2.jpg'])"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "outputs": [
    {
     "data": {
      "text/plain": "      form_relation_prediction_key   form_relation_prediction_value  \\\n0                         division                    allied health   \n0                           course                              hce   \n0                           number                              116   \n0                            title      calculations medical dosage   \n0                          credits                                2   \n0                     developed by                      dr . by taz   \n0  lecture / lab lecture / o ratio                                2   \n0                  course activity                               no   \n0                         cip code                        51 . 0800   \n0                         semester                         fall and   \n0                      ge category                             none   \n0                     separate lab                               no   \n0                 course awareness                               no   \n0                           course                               no   \n1                           name :                   dribbler , bbb   \n1                     study date :          12 - 09 - 2006 , 6 : 34   \n1                             bp :                    120 / 80 mmhg   \n1                            mrn :                   12341820060912   \n1               patient location :                             room   \n1                             hr :                          100 bpm   \n1                            dob :                   19 - 06 - 1979   \n1                         gender :                             male   \n1                         height :                           123 cm   \n1                            age :                         27 years   \n1                         weight :                            25 kg   \n1               reason for study :                               mi   \n1                            bsa :                         0 . 92 m   \n1                        history :                       asfgfdgsdg   \n1                    medications :           heparine , paracetamol   \n1                      performed .  the study technically limited .   \n1                                .                               no   \n\n                                                path  \n0  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n0  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n0  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n0  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n0  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n0  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n0  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n0  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n0  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n0  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n0  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n0  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n0  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n0  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n1  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n1  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n1  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n1  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n1  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n1  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n1  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n1  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n1  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n1  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n1  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n1  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n1  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n1  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n1  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n1  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  \n1  file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...  ",
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>form_relation_prediction_key</th>\n      <th>form_relation_prediction_value</th>\n      <th>path</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>division</td>\n      <td>allied health</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>0</th>\n      <td>course</td>\n      <td>hce</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>0</th>\n      <td>number</td>\n      <td>116</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>0</th>\n      <td>title</td>\n      <td>calculations medical dosage</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>0</th>\n      <td>credits</td>\n      <td>2</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>0</th>\n      <td>developed by</td>\n      <td>dr . by taz</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>0</th>\n      <td>lecture / lab lecture / o ratio</td>\n      <td>2</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>0</th>\n      <td>course activity</td>\n      <td>no</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>0</th>\n      <td>cip code</td>\n      <td>51 . 0800</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>0</th>\n      <td>semester</td>\n      <td>fall and</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>0</th>\n      <td>ge category</td>\n      <td>none</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>0</th>\n      <td>separate lab</td>\n      <td>no</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>0</th>\n      <td>course awareness</td>\n      <td>no</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>0</th>\n      <td>course</td>\n      <td>no</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>name :</td>\n      <td>dribbler , bbb</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>study date :</td>\n      <td>12 - 09 - 2006 , 6 : 34</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>bp :</td>\n      <td>120 / 80 mmhg</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>mrn :</td>\n      <td>12341820060912</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>patient location :</td>\n      <td>room</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>hr :</td>\n      <td>100 bpm</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>dob :</td>\n      <td>19 - 06 - 1979</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>gender :</td>\n      <td>male</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>height :</td>\n      <td>123 cm</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>age :</td>\n      <td>27 years</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>weight :</td>\n      <td>25 kg</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>reason for study :</td>\n      <td>mi</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>bsa :</td>\n      <td>0 . 92 m</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>history :</td>\n      <td>asfgfdgsdg</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>medications :</td>\n      <td>heparine , paracetamol</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>performed .</td>\n      <td>the study technically limited .</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>.</td>\n      <td>no</td>\n      <td>file:/F:/Work/repos/nlu_new/ner/nlu/examples/c...</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "res_filtered = res[['form_relation_prediction_key','form_relation_prediction_value','path']]\n",
    "res_filtered"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2024-05-13T08:40:51.701641600Z",
     "start_time": "2024-05-13T08:40:51.627215600Z"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [],
   "metadata": {
    "collapsed": false
   }
  }
 ],
 "metadata": {
  "kernelspec": {
   "name": "myenv",
   "language": "python",
   "display_name": "myenv"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
